Abstract 

This study was intended to serve as an example of 
cross-validating results from student persistence 
prediction models that employed commonly available 
pre-college student characteristics. The study investigated 
whether the accuracy of predicting student persistence 
would vary because of the use of present-year vs. 
previous-year parameters on present-year data, and 
whether the same set of predictors would change in 
terms of predictive efficiency between years. Institutions 
with selective, liberal, and open admissions policies had 
consistent persistence prediction odds ratios across time 
regardless of the data on which model parameters were 
generated. As expected, all institutions demonstrated 
lower accuracy rates at higher persistence criteria, and 
accuracy rates differed between different sets of model 
parameters. The ACT Composite scores emerged as 
consistently stable, significant predictors of persistence. 
The paper concludes with a discussion of procedural 
issues to consider when using logistic regression to predict 
persistence. 

Introduction 

Many issues weigh upon the minds of administrators 
at higher education institutions. One of the more 
prominent among these issues is student persistence. 
According to Brawer (1996), approximately 50 percent of 
the freshmen enrolled in colleges and universities do not 
finish their degrees. However, many college counselors 
and administrators see persistence as a fundamental 
indicator of student success in postsecondary education 
(Kern, Fagley, & Miller, 1998). With such a large 
percentage of students failing to persist to graduation, it 
comes as no surprise to see an increase in efforts to 
identify factors related to student persistence. 



While some institutions have lower persistence rates 
than others, facilitating student persistence is valuable 
for both institutions and students. For instance, Tinto 
(1993) noted that reduction in financial resources from 
external funding agencies (e.g., endowments, state 
government, etc.) is made more problematic by loss of 
income because of student non-persistence. For students 
who leave campus before graduation, resources used for 
recruitment, orientation, and support services are rarely 
recovered. Tinto also indicated that in extreme cases, 
declining enrollments and high non-persistence rates 
could lead to the collapse of an institution. 

From a student’s perspective, facilitation of persistence 
is also important. Were a student to terminate his/her 
postsecondary education before obtaining a degree, the 
student is more likely to experience a loss of future 
income, as well as experience higher levels of frustration 
and lower self-esteem (Tinto, 1993). Furthermore, 
students who persist to graduation often have easier 
access to prestigious positions in society and experience 
more societal rewards (Tinto). 

No doubt, institutions and students will generally benefit 
from student persistence to graduation. However, if an 
institution is to play an active role in influencing students 
to remain enrolled, the institution needs to identify students 
who will benefit from interventions known to have a positive 
impact on persistence (Levitz & Noel, 1985). Furthermore, 
many interventions need to be in place from the time 
students arrive on campus (Sadler, Cohen, & Kockesen, 
1997). This need translates into a common purpose for 
persistence studies: to develop an early warning system 
designed to identify students at-risk for non-persistence 
(Lenning, 1982). 

With a goal of early intervention in mind, identification/ 
prediction of incoming freshmen at-risk for non- 
persistence is necessarily based on pre-college data. 
Because institutions cannot know whether students 
actually will persist or not before coming to campus, the 
only recourse for predicting persistence is to use a model 
with pre-college data from previous year(s). Doing so 
assumes that the predictive model, generated on a 
different group of students, is applicable to the current 
cohort. With the changing characteristics of today’s 
college student populations (e.g., age, financial assistance, 
race/ethnicity), such an assumption may not be tenable. 
If the assumption is false, an institution’s predictions will 
be inaccurate, and resources earmarked for programs 
intended to improve persistence will not be spent on students 
who really need them. Therefore, the purpose of this paper 
was to conduct a cross-validation study that investigated 
the stability of persistence prediction models between two 
consecutive incoming postsecondary student cohorts. 

Literature Review 

It is tempting for researchers to look for common 
indicators of student persistence that apply across a 
range of institutions. However, the range of campus 
environments and sub-environments that exist make such 
research prone to overlooking important predictors within 
specific institutions. Though this study used the same 
model at all institutions, analyses were conducted within 
four individual institutions. The following review of literature 
will briefly discuss factors commonly associated with 
student persistence found at many institutions, predictors 
identified by researchers within specific institutions, and 
an approach using this information to identify students at- 
risk for non-persistence. 

Predictors of Persistence Across Institutions 

A number of factors have emerged as predictors of 
year-to-year persistence across multiple institutions. For 
instance, Horn and Carroll (1998) found that students 
who left college before their second year and never 
returned tended to be older, have children, and worked 
full-time relative to students who returned. Other common 
factors shown to be related to student persistence include 
tuition and debt load (Cofer & Somers, 1998), behavioral 
intentions, general attitudes toward higher education, 
social and academic integration, and student/institutional 
fit (Cabrera, Castaneda, Nora, & Hengster, 1992; Tinto, 
1997; Tinto, 1993). 

College GPA has also consistently been shown to have a 
strong relationship with persistence (Braunstein, McGrath, 
& Pescatrice, 2000; Gillespie & Noble, 1992; Johnson & 
Molnar, 1996; Kern et al., 1998; Tinto, 1993). Though a 
powerful predictor of student persistence, college GPAs 
are unavailable for incoming freshmen. Thus, researchers 
often must take advantage of the relationship between 
college GPA, high school GPA, and standardized test 
scores by using pre-college academic variables instead 
of college GPA to predict persistence (ACT, 1997). 

Within-Institution Predictors of Persistence 

Many studies of persistence focus on results from 
within individual institutions. For instance, certain student 
personality types, self-efficacy, empathy, and physical 
fitness, as well as dissatisfaction “over the mismatch 
between their expectations and their experiences at the 
institution” (Zhang & RiCharde, 1998; p. 6) were 
significantly related to freshman student persistence. Kern 
et al. (1998) found that ACT scores, information 
processing, selecting main ideas, self-testing, and the 
composite of motivation, time management, and 
concentration had indirect effects on non-persistence 
through college GPA. This last set of findings was 
considered most important for two reasons. First, many 
of these skills and attitudes can be taught. Second, the 
influence of these variables on GPA can be investigated, 
followed by the influence of GPA on persistence (Kern et 
al., 1998). 

Johnson and Molnar (1996) found that the odds of 
persistence for Black students were 50 percent greater 
than for other groups, after controlling for other academic 
and social variables. Other findings indicated that pre- 
enrollment variables (i.e., high school GPA, ACT 
Mathematics scores, etc.) and post-enrollment variables 
(i.e., satisfaction with major, expected vs. actual grades, 
etc.) could be used to identify student and institutional 
characteristics related to college student persistence 
(Gillespie & Noble, 1992). 

Practical Application of Persistence Research 

The studies cited above provided lists of variables 
potentially important to consider when studying persistence. 
However, the great diversity of environments and sub- 
environments within postsecondary institutions renders 
within-institution persistence analyses nearly a necessity. 

An example of a within-institution study was Nichols, 
Orehovec, and Ingold (1998), who discussed using some 
of the variables above with logistic regression to identify 
incoming freshmen who were at-risk for non-persistence. 
Nichols et al. found that by using estimated conditional 
probabilities of success based upon models developed 
on their 1993 cohort, they could predict with reasonable 
accuracy the students who were non-persistence prone 
in their 1995 cohort. Students who did not meet the 
probabilistic criterion were considered at-risk, and were 
subsequently provided with additional services or 
resources (e.g., academic advising or counseling) 
designed to improve the probability of persisting. 

A similar approach was demonstrated in Sadler, et al. 
(1997), who applied logistic regression to identify students 
at-risk for not persisting into their second year of college 
using five different criterion levels of estimated conditional 
probabilities of persisting. Staff would identify and contact 
at-risk students, and subsequent services would ideally 
meet whatever student needs may exist on a one-to-one 
basis. The intent of the plan was to improve 
persistence through interaction with these at-risk students 
at multiple points in time (e.g., prior to the start of the fall 
semester, after the third week of classes, at the end of 
the first semester). 

Though statistical identification of at-risk students 
exemplified in Nichols, et al. (1998) and Sadler et al. 
(1997) holds promise, there are some potential limitations 
associated with using statistical identification alone. These 
limitations include (but may not be restricted to) the 
following: 

1. Highly unequal percentages of persisters and 
non-persisters will impact logistic regression parameter 
estimation. Therefore, results may be substantially 
affected by an institution’s overall persistence rate. 

2. Logistic regression results can be very complicated, 
but both studies address this problem by making 
the results user-friendly in terms of probabilities. 
However, in-house staff may not have the statistical 
savvy required to obtain and fully explain the results 
to colleagues. 

3. Predictive models obtained from students in one 
year are often assumed to validly predict persistence 
for subsequent cohorts of incoming freshmen to a 
similar degree of accuracy. 

A solution to the first limitation would be to augment 
statistical logistic regression results with other types of 
information (e.g., focus groups, contact with residence 
hall assistants (RAs), etc.) when the ratio of persisters to 
non-persisters is relatively high, and the second limitation 
can be overcome by staff development or hiring practices. 
The final potential limitation does not have an easy solution. 
Should predictors (and their associated model) from one 
year do a poor job of predicting persistence of a subsequent 
cohort, identification of incoming freshmen at-risk for leaving 
campus would be based upon faulty information. The third 
limitation was the focus of this study. 

Methodology 

Participants 

Student data came from institutions selected on the 
basis of admissions selectivity, because research has 
shown that the persistence rates tend to vary as a function 
of institutional selectivity (Tinto, 1993). This difference in 
persistence rates could affect the functioning of the logistic 
modeling; hence stratification on selectivity was intended 
to control for this effect. The schools used for this study 
were randomly selected, within selectivity classification, 
from a group of 24 that participated in the ACT Retention 
Service for both the 1999-2000 and 2000-2001 academic 
years, and had more than 500 student records each year. 
Institutions with fewer than 500 student records were 
omitted from the selection pool, because small sample 
sizes in non-persistence groups could result in 
independent variables nearly completely predicting 
persistence. This situation could also arise when there 
are large imbalances between the percentage of students 
persisting and not persisting. Such “quasi-complete 
separation of data points” would inhibit maximum 
likelihood estimation of parameters and logistic model fit 
could not be achieved. 
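
A simple screening check for this problem can be sketched as follows. The sketch assumes a single categorical predictor and a 0/1 persistence indicator; the function name and the toy data are hypothetical, and an empty cell of this kind is only one way in which quasi-complete separation can arise.

```python
from collections import Counter

def empty_outcome_cells(predictor, persisted):
    """Return predictor categories in which only one outcome is observed
    (all persisters or all non-persisters)."""
    counts = Counter(zip(predictor, persisted))
    flagged = []
    for level in sorted(set(predictor)):
        stayed = counts[(level, 1)]  # persisters in this category
        left = counts[(level, 0)]    # non-persisters in this category
        if stayed == 0 or left == 0:
            flagged.append(level)
    return flagged

# Hypothetical toy data: every out-of-state student persisted.
residency = ["in", "in", "in", "in", "out", "out", "out"]
persisted = [1, 0, 1, 0, 1, 1, 1]
flagged = empty_outcome_cells(residency, persisted)
print(flagged)
```

A flagged category suggests that the maximum likelihood estimate for that category may not be finite, so the category might be collapsed or the predictor reconsidered before fitting.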

This study used data from consecutive years because 
of data availability. However, studies such as this 
conducted as part of an ongoing persistence research 
program will often skip a year in the data. This is because 
data would be gathered from applicants prior to enrollment 
in the fall of year one, and persistence status would not 
be known until the fall of year two (using the same 
definition of persistence as is used in this study). Those 
results would then be used for students in the fall of year 
three. A shorter period of time between groups of 
enrollees, such as was used in this study, may increase 
the degree of observed predictive stability, thereby biasing 
the present results toward conclusions of model stability. 

All institutions in the study were four-year institutions. 
Though study of persistence at two-year institutions is 
needed, such schools experience more complexity in 
defining persistence than four-year institutions because 
of relatively high transfer rates, percentages of part-time 
students, percentages of students seeking professional/ 
personal enrichment (rather than a degree), etc. 
Subsequent research based upon the present study may 
look at two-year institutions. 

Because the four-year institutions in the sample pool 
varied in terms of size, type of governance, selectivity, 
and other factors, it was decided that randomly sampling 
one school from each of five selectivity classifications 
would control the impact of selectivity, and give other 
relevant institutional characteristics an equal chance of 
being represented in the results. For the purposes of this 
study, institutions were initially classified into five selectivity 
groups based upon the criteria in Table 1, but the final 
classification was based upon institutional self-report. 



Table 1

Selectivity Definitions

                         Typical Pre-College Academic Characteristics of Student Body
Self-Reported            Interquartile Range
Admissions               of Avg. ACT
Selectivity              Comp. Scores           High School Class Rank

Highly Selective         27-31                  Majority of students in top 10% of h.s. graduating class
Selective                22-27                  Majority in top 25%
Traditional              20-23                  Majority in top 50%
Liberal                  18-21                  Some students from lower 50%
Open                     17-20                  All high school graduates accepted, to capacity

Institutional self-report resulted in mean ACT scores 
at the colleges in this study not necessarily falling within the 
typical ranges in Table 1 (compare to Table 2). No highly 
selective institutions with more than 500 records had 
persistence data for both the 1999-2000 and 2000-2001 
school years, so data from only four schools, one from 
each of the remaining selectivity classes, were used. 

Table 2 presents the number of student records 
available and percent of students returning for a second 
year for each institution per school year. 



Table 2

Institution Characteristics by Admissions Policy

                     Number of Complete Student Records Available
Self-Reported        1999-2000 Data                     2000-2001 Data
Admissions                    Pct.      Mean ACT                 Pct.      Mean ACT
Selectivity          N        Return    Composite       N        Return    Composite

Selective            3,509    85        24.3            3,426    87        24.5
Traditional            817    78        23.2              908    79        23.3
Liberal              1,049    81        22.7            1,433    80        22.8
Open                 1,038    68        20.2            1,091    73        20.1


Within selectivity groups, only one institution was 
randomly selected. This was done for two reasons. 
First, for consistency with other persistence research, 
this study was looking at persistence within institutions. 
This required repetition of all analyses as many times as 
there were institutions. Second, because the sample 
pool lacked representativeness of any readily defined 
population of institutions, generalizability was curtailed. 
As such, the additional computational complexity of 
including individual analyses of more institutions would 
add little in terms of interpretation and use of results. 

Definition of Dependent Variable: Persistence 

A student described as a “persister” was one who 
initially came to campus in the fall and was continuously enrolled 
up to and including the following fall. All others who took 
a “break” or permanently left before enrolling for the following 
fall were defined for this paper to be non-persisters. This 
contrasts with other common definitions, such as when a 
persister enrolls in consecutive fall terms — even if he/she 
does not enroll in the interim spring term. 

The decision to use a definition of persistence relative 
to the second year was made for three reasons. First, 
undergraduates who complete their first year of education 
and subsequently reenroll in their second year have a 
higher likelihood than not of obtaining their degree (Horn 
& Carroll, 1998). Second, nearly a third of all students 
leave postsecondary education before reaching their 
second year. This is a higher proportion than all other 
years combined (Horn & Carroll). Third, the first to 
second year attrition rate is generally the most significant 
determinant of ultimate graduation rate for an institution 
(Levitz, Noel, & Richter, 1999). 



Independent Variable Definitions and Selection 

This study advocated identifying and using institution- 
specific predictors for within-institution persistence 
research. However, six predictors commonly found in 
within-institution studies were chosen for use within each school. 
The primary reason for this was that the study’s purpose 
was to look at the applicability of a single prediction 
equation and its constituent variables based on a common 
model for two separate years of students, rather than 
identifying an optimal model. The technique, in turn, 
could be replicated by practitioners using variables 
appropriate for their own student populations. Though 
the common model may not be optimal for a single 
institution, the predictors were selected on the basis of 
findings in other articles and substantive concerns. The 
intent of their use was for the present results to have 
some applicability to different institutions, while recognizing 
that better predictors may be available for local campuses. 
In practice, institutions looking to replicate the present 
study with their own students may wish to look beyond 
the more global set of predictors and employ relevant 
institution-specific variables. 

The six predictor variables included in this study were 
selected because they have been used in other research 
or were of interest to broaden the demographic coverage 
of the model. Furthermore, all were pre-college variables. 
The first predictor was in-state/out-of-state, reflecting 
students’ state residency statuses. The second variable 
was commitment to one’s major, measured using a three- 
point Likert rating of how sure students were of an intended 
major that they listed when applying to take the ACT 
Assessment. Third, this study looked at the choice of 
colleges students were attending. Students who were 
attending their first or second choice institution were 
grouped, and students attending their 3rd-6th choice were 
also grouped, thereby creating a dichotomous institution 
choice variable. 

A fourth predictor was gender. The fifth predictor was 
race/ethnicity. Because of model convergence problems 
with more specific classifications, race/ethnicity was 
dichotomized into Caucasian/Minority for this study. 

Though high school GPA is often found to be an 
efficient pre-college predictor of persistence, the sixth 
predictor selected was ACT Composite instead. This 
decision was made because ACT Composite score and 
high school GPA are both often found to be efficient 
predictors of persistence in and of themselves, but 
collinearity between the two could give rise to results that 
mask the efficiency of either predictor. Because the 
researcher was more concerned with the stability of ACT 
Composite than high school GPA as a predictor and did 
not want the issue of collinearity to be a factor, ACT 
Composite score was the only pre-college cognitive 
variable used. 

In practice, the selection of variables depends greatly 
on the purpose of one’s persistence study. In the present 
study, variables were selected only as indicators of risk 
for non-persistence with no accompanying interventions 
for at-risk students. Under these conditions, one of the 
most important factors in selecting variables is whether 
the predictive efficiency of the overall model is maximized. 
Should there be interventions, however, we need to attend 
not only to getting an efficient model, but also to having 
variables that can be manipulated through planned 
intervention. Because only two of the present variables 
can be manipulated (e.g., remedial courses can improve 
academic proficiency, and career/major counseling 
programs can improve student/major fit), this model would 
be of limited use in helping an institution to appropriately 
implement persistence-enhancing interventions. 

Scope of Investigation 

This study will focus on predicting persistence using 
pre-college data for incoming freshmen. This pre-college 
focus is based on the notion that students at-risk for non- 
persistence may require intervention within the first few 
weeks of the freshman year (Zhang & RiCharde, 1998). 
At-risk students often require immediate attention, such 
as special recruitment, admissions, orientation practices 
(e.g., placing special emphasis on clearly communicating 
what the institution expects from the students, and what 
students can expect from the institution; see Brawer, 
1996; Kim & Sedlacek, 1996; Kuh, 1991; Tinto, 1993), 
community-building activities (Tinto), mentoring programs 
(Brawer), advising programs (Wang & Grimes, 2000), 
and the like. These activities/programs require resource 
expenditure in order to be successful. As such, institutions 
want to be confident that their resources are being spent 
on correctly identified students. 

Analysis 

In order to satisfy the purpose of this study, there were 
two foci of analysis. The first focus was on the general 
stability of each variable’s predictive efficiency. Logistic 
regression was performed on 1999-2000 data, and again 
on 2000-2001 data for each of the institutions using the 
set of predictors identified earlier. Stability of each 
variable’s predictive efficiency was assessed through the 
use of a chi-square test for the significance of the 
difference between non-standardized regression 
coefficient magnitudes across years. Because non- 
standardized logistic regression coefficients are 
asymptotically normal when necessary assumptions are 
met, dividing the difference between paired coefficients 
in each model by a pooled standard error and then 
squaring the result gives a statistic that is distributed as 
a chi-square with one degree of freedom. Statistically 
significant values for this statistic (i.e., p ≤ .05) for a given 
variable might be considered evidence arguing against 
use of the variable, as it would suggest that the variable 
functions differentially depending on the cohort. However, 
this is not the case if discrepancies are statistically 
significant, yet the variable predicts persistence at a 
statistically significant level of efficiency in both years 
(see discussion of ACT Composite in Results section). 
One should note that the statistic described above is not 
the Wald Test, as the Wald is not intended to test the 
difference between two parameters in equations derived 
from separate samples. 
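
The statistic described above can be sketched in a few lines. The sketch assumes that the pooled standard error is the square root of the sum of the two squared standard errors, which is a common choice for independent samples but is not spelled out in the text; the function name and the input values are hypothetical.

```python
import math

def coefficient_difference_test(b1, se1, b2, se2):
    """Chi-square (1 df) test of the difference between two coefficients
    estimated on independent samples."""
    # Assumed form of the pooled standard error (not given in the text).
    pooled_se = math.sqrt(se1 ** 2 + se2 ** 2)
    z = (b1 - b2) / pooled_se
    chi_sq = z ** 2
    # For 1 df, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_sq / 2.0))
    return chi_sq, p_value

# Hypothetical estimates and standard errors for one predictor in two years.
chi_sq, p_value = coefficient_difference_test(0.10, 0.03, 0.20, 0.04)
print(round(chi_sq, 2), round(p_value, 4))
```

Because the two equations come from separate cohorts, no covariance term between the estimates is included in this sketch.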

A second focus of the analyses dealt with determining 
whether the use of the previous year’s equation would 
effectively predict persistence in a new sample 
(Kleinbaum, Kupper, & Muller, 1988). A cross-validation 
procedure was run analogous to that described by 
Pedhazur (1982) for ordinary least-squares regression. 
The 1999-2000 data served as the calibration sample 
and the 2000-2001 data as the validation sample. After 
obtaining separate logistic equations for the 1999-2000 
and 2000-2001 data using the same set of predictors, the 
resulting equations were both applied to 2000-2001 data 
in order to generate two separate sets of predicted 
persistence statuses (e.g., predicted persisters and 
predicted non-persisters). 

Within each institution, separate 2x2 contingency tables 
were created by crossing predicted and actual persistence 
status based on results of applying the calibration (i.e., 
1999-2000) and validation (i.e., 2000-2001) sample 
equations to the validation data (see Table 3). 
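
The cross-tabulation step can be illustrated with a minimal sketch. It assumes that the two logistic equations have already been estimated; the coefficients, the single predictor, the student records, and the function names below are all hypothetical and serve only to show how both equations are applied to the same validation-year data.

```python
import math

def predicted_probability(intercept, coefs, row):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_j * x_j))))."""
    logit = intercept + sum(b * x for b, x in zip(coefs, row))
    return 1.0 / (1.0 + math.exp(-logit))

def cross_tabulate(equation, rows, actual, criterion=0.50):
    """Build a 2x2 table keyed by (actual, predicted), where 1 = persister."""
    intercept, coefs = equation
    table = {(a, p): 0 for a in (0, 1) for p in (0, 1)}
    for row, act in zip(rows, actual):
        prob = predicted_probability(intercept, coefs, row)
        pred = 1 if prob >= criterion else 0
        table[(act, pred)] += 1
    return table

# Hypothetical equations: (intercept, [coefficient for ACT Composite]).
calibration_eq = (-1.0, [0.05])  # stands in for the 1999-2000 fit
validation_eq = (-2.5, [0.10])   # stands in for the 2000-2001 fit

# Hypothetical validation-year students with one predictor each.
rows = [[16], [18], [22], [24], [28], [30]]
actual = [0, 1, 0, 1, 1, 1]

table_cal = cross_tabulate(calibration_eq, rows, actual)
table_val = cross_tabulate(validation_eq, rows, actual)
print(table_cal)
print(table_val)
```

Both tables describe the same students, so any difference between them reflects the parameters alone.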

Table 3

Example 2x2 Persistence Status Table

                       Predicted Persistence Status
Actual Persistence
Status                 Persisters                               Non-Persisters

Persisters             N Correctly Predicted Persisters         N Incorrectly Predicted Persisters
Non-Persisters         N Incorrectly Predicted Non-Persisters   N Correctly Predicted Non-Persisters

This process was replicated for each “predicted 
persister” criterion described later. These tables were 
compared according to accuracy of predicting persisters 
and non-persisters separately, as well as through accuracy 
rates (i.e., overall percentage of correct predictions). 
This author has been unable to find a set of criteria to use 
when classifying differences in accuracy rates as being 
small, medium, large, etc. As a result, this question has 
to be considered from a more pragmatic, institution-specific 
perspective. 
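
The three accuracy summaries named above can be computed from a single 2x2 table, as in the following sketch. The counts and the function name are hypothetical, and the sketch assumes that each actual-status group contains at least one student.

```python
def accuracy_summary(table):
    """Summarize a 2x2 table keyed by (actual, predicted), where 1 = persister."""
    n_persisters = table[(1, 1)] + table[(1, 0)]
    n_non_persisters = table[(0, 0)] + table[(0, 1)]
    total = n_persisters + n_non_persisters
    return {
        # Share of actual persisters who were predicted to persist.
        "persister_accuracy": table[(1, 1)] / n_persisters,
        # Share of actual non-persisters who were predicted not to persist.
        "non_persister_accuracy": table[(0, 0)] / n_non_persisters,
        # Overall percentage of correct predictions, as a proportion.
        "accuracy_rate": (table[(1, 1)] + table[(0, 0)]) / total,
    }

# Hypothetical counts for one institution and one criterion.
example_table = {(1, 1): 700, (1, 0): 100, (0, 1): 120, (0, 0): 80}
summary = accuracy_summary(example_table)
print(summary)
```

With highly unequal group sizes, the overall accuracy rate can look acceptable even when few non-persisters are identified, which is why the two groups are also reported separately.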

To facilitate understanding of accuracy rates from a 
pragmatic perspective, suppose there are two fictitious 
institutions, A and B. Institution A has admitted 100 
students, whereas Institution B has admitted 10,000 
students. Furthermore, suppose that each institution 
attempts to identify students at-risk for non-persistence, 
and has data available to perform a cross-validation 
study such as exemplified by the present study. Assume 
that in both cross-validation studies, an increase of only 
one percent in accuracy rate is observed when using 
present- vs. prior-year parameters. 

At Institution A, the one percent accuracy rate increase 
means that only one additional student would be incorrectly 
classified if the institution continued to use the prior 
year’s parameters for prediction. At Institution B, however, 
that same one percent would result in 100 students being 
incorrectly classified when using the prior year’s 
parameters — many students would receive services that 
were unnecessary, and/or many would not receive 
necessary services. As institutional leaders, one of the 
tough decisions to make is whether the number of students 
affected by differences in accuracy rates is large enough 
at a specific institution to revisit how one uses a statistical 
model for identifying at-risk students. 
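
The arithmetic behind this comparison is simple enough to script for any enrollment size. The function name below is hypothetical; the inputs are the two fictitious institutions described above.

```python
def students_affected(n_admitted, accuracy_difference):
    """Number of students classified differently for a given difference
    in accuracy rate, expressed as a proportion."""
    return round(n_admitted * accuracy_difference)

affected_a = students_affected(100, 0.01)     # Institution A
affected_b = students_affected(10_000, 0.01)  # Institution B
print(affected_a, affected_b)
```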

Finally, the two 2x2 tables within each institution and 
criterion classification were combined to form 2x2x2 
tables, representing the following three crossed 
dimensions: (prediction equation year: '99-'00, '00-'01) 
x (predicted persistence: predicted left, predicted stayed) 
x (actual persistence: actually left, actually stayed) (see 
Table 4). 



Table 4

Example 2x2x2 Persistence Status Table

Parameter Year '99-'00
                       Predicted Persistence Status
Actual Persistence
Status                 Persisters                               Non-Persisters

Persisters             N Correctly Predicted Persisters         N Incorrectly Predicted Persisters
Non-Persisters         N Incorrectly Predicted Non-Persisters   N Correctly Predicted Non-Persisters

Parameter Year '00-'01
                       Predicted Persistence Status
Actual Persistence
Status                 Persisters                               Non-Persisters

Persisters             N Correctly Predicted Persisters         N Incorrectly Predicted Persisters
Non-Persisters         N Incorrectly Predicted Non-Persisters   N Correctly Predicted Non-Persisters

A Breslow-Day test for homogeneity of odds ratios 
(Breslow & Day, 1980) was conducted on each 2x2x2 
table to determine whether odds ratios for 2x2 tables 
from each pair of parameter years differed. The Breslow- 
Day test is a standard result generated by some statistical 
programs. However, some programs may not generate 
this statistic. In such situations, a viable alternative 
would be to use other chi-square tests that permit analysis 
of a 2x2x2 table. Discrepancies in prediction accuracy 
and/or rejected null hypotheses under the Breslow-Day 
test would support the argument that the two equations 
do not permit similar prediction accuracy levels, thereby 
calling into question the practice of using results from 
previous year’s students to make probabilistic predictions 
about students coming to campus for the present year. 
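
For readers whose software does not report the Breslow-Day statistic, the calculation can be sketched directly. The following is a minimal pure-Python sketch, assuming the uncorrected form of the statistic (no Tarone adjustment) and the Mantel-Haenszel estimate of the common odds ratio. The function name, the table layout ((a, b), (c, d)), and the counts are hypothetical and are not taken from the study's data.

```python
import math

def breslow_day(tables):
    """Breslow-Day statistic for homogeneity of odds ratios.

    `tables` is a list of 2x2 tables given as ((a, b), (c, d)).
    Returns (statistic, df, p_value); the p-value is computed only
    for the two-table (1 df) case considered here.
    """
    # Mantel-Haenszel estimate of the common odds ratio.
    num = sum(a * d / (a + b + c + d) for (a, b), (c, d) in tables)
    den = sum(b * c / (a + b + c + d) for (a, b), (c, d) in tables)
    psi = num / den

    statistic = 0.0
    for (a, b), (c, d) in tables:
        n1, m1, n = a + b, a + c, a + b + c + d
        # Expected count A for cell a under the common odds ratio solves
        # A * (n - n1 - m1 + A) = psi * (n1 - A) * (m1 - A).
        qa = 1.0 - psi
        qb = (n - n1 - m1) + psi * (n1 + m1)
        qc = -psi * n1 * m1
        if abs(qa) < 1e-12:
            roots = [-qc / qb]
        else:
            disc = math.sqrt(qb * qb - 4.0 * qa * qc)
            roots = [(-qb + disc) / (2.0 * qa), (-qb - disc) / (2.0 * qa)]
        # Keep the root that yields positive expected counts in all cells.
        lo, hi = max(0, n1 + m1 - n), min(n1, m1)
        expected = next(r for r in roots if lo < r < hi)
        variance = 1.0 / (1.0 / expected
                          + 1.0 / (n1 - expected)
                          + 1.0 / (m1 - expected)
                          + 1.0 / (n - n1 - m1 + expected))
        statistic += (a - expected) ** 2 / variance

    df = len(tables) - 1
    p_value = math.erfc(math.sqrt(statistic / 2.0)) if df == 1 else None
    return statistic, df, p_value

# Hypothetical 2x2 tables (rows: actual persisters, actual non-persisters;
# columns: predicted persisters, predicted non-persisters), one per
# parameter year.
table_prior_year = ((700, 100), (120, 80))
table_present_year = ((600, 200), (100, 100))
statistic, df, p_value = breslow_day([table_prior_year, table_present_year])
print(round(statistic, 3), df, round(p_value, 4))
```

A small p-value would be read, as described above, as evidence that the two sets of parameters do not relate predicted and actual persistence in the same way.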

To establish predicted persistence status, estimated 
probabilities of persistence generated for the second 
analysis focus were put through three recoding 
procedures, as described in Sadler et al. (1997). For the 
first criterion, if a student’s estimated probability equaled 
or exceeded 0.50, he/she was classified as a predicted 
persister. Otherwise, he or she was predicted not to 
persist. Second, students were also classified relative to 
an estimated probability of persistence of 0.70. Finally, a 
similar classification process was carried out relative to 
a probability criterion of p=0.85. Multiple definitions of what 
constituted a predicted persister were necessary because 
the accuracy of prediction results and relationships between 
predicted and actual persistence were hypothesized to 
vary as a function of the persistence criterion. 
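
The three recoding procedures can be sketched as follows. The sketch assumes that the "equaled or exceeded" rule stated for the 0.50 criterion also applies at 0.70 and 0.85; the probabilities and names below are hypothetical.

```python
CRITERIA = (0.50, 0.70, 0.85)

def classify(probabilities, criterion):
    """Return 1 (predicted persister) when the estimated probability equals
    or exceeds the criterion, and 0 (predicted non-persister) otherwise."""
    return [1 if p >= criterion else 0 for p in probabilities]

# Hypothetical estimated probabilities of persistence for five students.
probabilities = [0.45, 0.50, 0.72, 0.85, 0.93]
predicted = {c: classify(probabilities, c) for c in CRITERIA}
for criterion, statuses in predicted.items():
    print(criterion, statuses, sum(statuses))
```

Raising the criterion can only reduce the number of predicted persisters, so more students are flagged as at-risk at each step.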

Results 

Stability of Predictor Efficiency 

Absolute predictor efficiency. In determining 
accuracy of predicting persistence using the previous 
year’s model, the first focus in the analysis looked at how 
efficient each variable was in predicting retention from 
one year to the next. Table 5 shows that ACT Composite 
score stood out from the rest as a consistently significant 
predictor, where its odds ratios significantly differed from 
1.0 for both years at all institutions. Other examples of 
efficient predictors were In-state/Out-of-state, Sureness 
of Major, and Gender. 
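
The odds ratios in Table 5 follow from the parameter estimates by exponentiation. The short sketch below reproduces the ACT Composite odds ratios for the selective institution from the reported estimates; the five-point comparison at the end is a hypothetical illustration and assumes the model is linear in the logit.

```python
import math

# Parameter estimates for ACT Composite at the selective institution (Table 5).
act_estimates = {"1999-2000": 0.093, "2000-2001": 0.096}

# Odds ratio for a one-point increase: exp(parameter estimate).
odds_ratios = {year: round(math.exp(b), 3) for year, b in act_estimates.items()}
print(odds_ratios)

# Hypothetical illustration: odds multiplier for a five-point difference
# in ACT Composite under the 1999-2000 equation.
five_point_multiplier = math.exp(5 * act_estimates["1999-2000"])
print(round(five_point_multiplier, 2))
```

An odds ratio above 1.0 means that each additional ACT Composite point is associated with higher odds of persisting, holding the other predictors constant.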

Changes in predictor efficiency. As seen in Table 
5, nearly all differences between model parameters were 
statistically insignificant at p = .05 from one year to the 
next, regardless of predictor and institution. The only 
exception was ACT Composite at the open institution, 
where the p-value was 0.027. Though the difference was 
statistically significant, ACT Composite score was still an 
efficient predictor for both years. Even with a significant 
difference in efficiency, the fact that it was a statistically 
significant predictor in both years allowed Composite 
scores to maintain their viability for use in a model. 

As seen in Table 5, the direction of relationships between 
predictors and persistence sometimes varied between years 
(e.g., odds ratios changed from being less than one to 
greater than one, or vice versa). For instance, Caucasian/ 
Minority group membership was positively associated with 
persistence in 1999-2000, but negatively in 2000-2001 at 
the traditional institution. On the other hand, ACT Composite 
score was consistently positively associated with 
persistence, where higher scores were associated with a 
greater probability of persisting. 

Predictive Accuracy: Cross-Validation 

Critical to the investigation of persistence prediction 
accuracy is the second analysis focus: stability of 
prediction accuracy. This stability will be described from 
two different vantage points: comparing among predicted 
persistence criteria, and comparing among calibration/ 
validation parameters. 

Table 5 

Comparison of Logistic Regression Parameters 

                    1999-2000 Data                      2000-2001 Data                      Test for Diff.
                    (Calibration Sample)                (Validation Sample)                 in Parm. Est.
Variable            Parm Est.  Pr. Chi-Sq.  Odds Ratio  Parm Est.  Pr. Chi-Sq.  Odds Ratio  Chi-Sq.  Pr. Chi-Sq.

Selective Institution (1999-2000 N = 3,509; 2000-2001 N = 3,426)
Intercept              -1.030        0.009                 -1.277        0.001
In/Out State            0.009        0.965       1.009      0.405        0.037       1.500    1.868        0.172
College Choice          0.135        0.269       1.145      0.076        0.582       1.079    0.105        0.746
Sure of Ed. Major       0.295        0.000       1.343      0.266        0.000       1.305    0.082        0.774
Gender                 -0.161        0.095       0.851     -0.245        0.018       0.783    0.348        0.555
Cauc./Minority         -0.030        0.799       0.970      0.045        0.727       1.046    0.184        0.668
ACT Composite           0.093        0.000       1.097      0.096        0.000       1.101    0.030        0.863

Traditional Institution (1999-2000 N = 817; 2000-2001 N = 908)
Intercept              -2.169        0.000                 -1.294        0.023
In/Out State            0.435        0.019       1.545      0.601        0.001       1.824    0.423        0.516
College Choice          0.263        0.257       1.300      0.308        0.215       1.360    0.018        0.894
Sure of Ed. Major       0.093        0.438       1.098      0.407        0.001       1.502    3.469        0.063
Gender                 -0.258        0.141       0.773     -0.227        0.175       0.797    0.016        0.899
Cauc./Minority          0.480        0.038       1.615     -0.137        0.575       0.872    3.362        0.067
ACT Composite           0.110        0.000       1.116      0.063        0.003       1.065    2.354        0.125

Liberal Institution (1999-2000 N = 1,049; 2000-2001 N = 1,433)
Intercept              -1.374        0.013                 -0.764        0.094
In/Out State            0.444        0.027       1.559      0.078        0.650       1.081    1.915        0.166
College Choice          0.214        0.308       1.238      0.162        0.419       1.176    0.032        0.858
Sure of Ed. Major       0.165        0.143       1.179     -0.015        0.870       0.985    1.525        0.217
Gender                 -0.166        0.304       0.847     -0.404        0.003       0.667    1.280        0.258
Cauc./Minority         -0.164        0.418       0.849      0.037        0.823       1.037    0.593        0.441
ACT Composite           0.098        0.000       1.103      0.096        0.000       1.101    0.005        0.946

Open Institution (1999-2000 N = 1,038; 2000-2001 N = 1,091)
Intercept              -1.408        0.023                 -0.932        0.098
In/Out State           -0.269        0.552       0.764      0.235        0.534       1.265    0.733        0.392
College Choice          0.050        0.771       1.051      0.462        0.008       1.587    2.833        0.092
Sure of Ed. Major       0.118        0.210       1.125      0.181        0.059       1.199    0.227        0.634
Gender                 -0.032        0.814       0.968     -0.380        0.006       0.684    3.188        0.074
Cauc./Minority          0.284        0.128       1.329      0.424        0.032       1.528    0.266        0.606
ACT Composite           0.097        0.000       1.102      0.041        0.022       1.042    4.884        0.027

Comparisons of prediction accuracy among 
persistence criteria. As expected, the use of logistic 
regression as a basis for modeling student persistence 
resulted in the highest accuracy rates being observed 
when using the p=0.50 criterion, regardless of institution. 

Results using the p=0.50 criterion. When predicted 
persisters were classified according to whether their p 
exceeded 0.50, large differences were not observed in 
prediction accuracy under the use of the two sets of 
parameters (see Table 6). 
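
The accuracy rate is a weighted average of the two percent-correct figures, with the observed persistence rate as the weight. The sketch below assumes an observed persistence rate of 86.7% for the selective institution, a value inferred from the p=0.50 row of Table 6 (where every persister and no non-persister was correctly classified under the 1999-2000 parameters). The function name is illustrative.

```python
def accuracy_rate(pct_persisters_correct, pct_nonpersisters_correct, persistence_rate):
    """Overall percent correct as a weighted average of the two group-level percentages."""
    return (pct_persisters_correct * persistence_rate
            + pct_nonpersisters_correct * (1.0 - persistence_rate))


PERSISTENCE_RATE = 0.867  # assumed: inferred from the selective institution's p=0.50 row

# (persisters correct, non-persisters correct), 1999-2000 parameters, selective institution.
selective_rows = {0.50: (100.0, 0.0), 0.70: (99.1, 3.5), 0.85: (59.8, 57.2)}

implied_accuracy = {
    criterion: round(accuracy_rate(p, n, PERSISTENCE_RATE), 1)
    for criterion, (p, n) in selective_rows.items()
}
```

Under this assumption the implied accuracy rates are approximately 86.7%, 86.4%, and 59.5%, which agree with the tabled accuracy rates for the 1999-2000 parameters. When nearly every student is predicted to persist, as at the p=0.50 criterion, the accuracy rate is therefore close to the persistence rate itself.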

Results using the p=0.70 criterion. Under the p=0.70 
criterion, accuracy rates for each parameter set differed 
little for the selective and liberal institutions, and only 
mildly for the traditional institution, with the largest 
difference being for the open institution. Yet, these results 
can be misleading, in that the traditional institution, rather 
than the open institution, had a statistically significant 
Breslow-Day statistic (see Table 6). This apparent contrast 
is resolved by considering that the Breslow-Day statistic 
in this table does not focus on accuracy rates, but on 
odds ratios between the 2x2 tables defined for predicted 
results from each set of parameters. As such, the greater 
variability in individual cell percentages for the traditional 
institution’s tables accounted for the significant result. 
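
The distinction between accuracy rates and odds ratios can be sketched with two predicted-by-actual tables. The cell counts below are hypothetical (they are not taken from the study), and the sketch only computes the two summaries; it does not implement the Breslow-Day test itself.

```python
def summarize(true_pos, false_neg, false_pos, true_neg):
    """Accuracy rate and odds ratio for a predicted-by-actual persistence table.

    true_pos:  persisters predicted to persist
    false_neg: persisters predicted not to persist
    false_pos: non-persisters predicted to persist
    true_neg:  non-persisters predicted not to persist
    """
    total = true_pos + false_neg + false_pos + true_neg
    accuracy = (true_pos + true_neg) / total
    odds_ratio = (true_pos * true_neg) / (false_neg * false_pos)
    return accuracy, odds_ratio


# Hypothetical counts for one cohort of 908 students scored with two parameter sets.
accuracy_a, odds_ratio_a = summarize(true_pos=600, false_neg=120, false_pos=140, true_neg=48)
accuracy_b, odds_ratio_b = summarize(true_pos=500, false_neg=220, false_pos=107, true_neg=81)
```

In this hypothetical case the accuracy rates differ by roughly seven percentage points while the odds ratios are nearly equal, which is the kind of pattern that a review of accuracy rates alone would not reveal.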

Given the results above, had one simply reviewed 
accuracy rates, the significant difference in odds ratios 
would have been overlooked. Yet, using calibration 
parameters vs. validation parameters cut the accuracy of 
predicting non-persisters by about a third at the open 
institution. These results illustrate the point that institutions 
have to determine what is more important: accurately 
identifying non-persisters or persisters separately, or 
maximizing accuracy rates. Then, they have to balance 
their priority against how stable the predicted vs. observed 
persistence relationship is from one year to the next. 

Results using the p=0.85 criterion. For the selective 
institution, the percentages of correctly predicted persisters 
and non-persisters differed depending on the parameters 
used. However, this difference in predictive accuracy did 
not result in a significant Breslow-Day test (p = 0.575). 
For the liberal and traditional schools, percentages of 
correctly predicted persisters and non-persisters were 
relatively stable regardless of the parameters used, and 
accuracy rates differed little between parameter sets. In 
contrast, accuracy rates at the open and selective 
institutions differed by 6.9% and 8.3%, respectively, 
between parameter sets. The Breslow-Day tests were 
statistically nonsignificant as well (see Table 6). 

Results for the open institution warranted further 
consideration, as a counter-intuitive result emerged: the 
accuracy rate was higher using 1999-2000 parameters 
than 2000-2001 parameters with the 2000-2001 data 
(28.3% and 21.4%, respectively). Why this result occurred 
is unclear, though it is likely related to the drop in number 
(as opposed to percent) of predicted persisters. 
Regardless, odds ratios between parameter years were 
very similar to each other, suggesting that taken as a 
whole, the predicted vs. actual persister relationship 
changed little between parameter years. 

Discussion 

Considerations for Using Statistical Techniques 

This study was not intended to provide broad 
generalizations across many institutions, as persistence 
research is often best done within a single institution, and 
the variables selected for the present analyses were not 
necessarily optimal for the selected institutions. With this 
caveat in mind, the conclusions will be discussed below. 

When considering all of the results above, several 
issues stand out. First, for some institutions, the use of 
the previous year’s prediction equation on the present 
year’s data will provide similar results to using the present 
year’s equation. Yet, this is not always the case, as was 
seen for the traditional institution under the p=0.70 
criterion. Also, the same predictors may have similar 
efficiencies from one year to the next at some institutions, 
but not at others. 

For institutions that do not have extremely high or low 
persistence rates, the use of a criterion of p=0.50 may be 
the most advisable for several reasons. First, when logistic 
models are used and all parameters converge, this criterion 
will provide the maximum accuracy rates. Second, this 
criterion is more easily understood by non-technical 
colleagues, as it simply represents whether students have 
a higher probability than not of returning. Though this 
criterion may have more falsely identified at-risk students 
than other criteria (an assertion founded on raw results on 
which Table 6 is based), more students will receive 
persistence-targeted programming, thereby reaching out 
to those students who might otherwise be overlooked. 

The downside to using a criterion of p=0.50 is that as 
many as half of the incoming freshmen are identified as 
being at-risk, a proportion representing many students. 
As such, relatively large amounts of financial and 
personnel resources are necessary to provide prescribed 
interventions. An unfortunate reality experienced by many 
institutions is that resources are limited. Thus, the 
selection of criterion to use when identifying students at- 
risk for non-persistence requires balancing priorities of 
serving as many students in as accurate a manner as 
possible, yet within a context of limited resources. 

AIR Professional File, Number 93, Cross-Validation of.... 

Table 6 

Accuracy of Predicted Persistence Using Calibration (99-00) and Validation (00-01) 
Parameters on 00-01 Data 

                    % Correct Pred. Using 99-00 Parms.  % Correct Pred. Using 00-01 Parms.  Breslow-Day Results
                                      Non-    Accuracy                    Non-    Accuracy
Admissions Policy   Persisters  Persisters        Rate  Persisters  Persisters        Rate      N  Chi-Square  P-Value

P-hat=.50 Criterion
Selective               100.0%        0.0%       86.7%      100.0%        0.2%       86.7%  3,426         N/A      N/A
Traditional              99.4%        2.6%       79.1%       99.9%        0.0%       78.9%    908       1.130    0.288
Liberal¹                100.0%        0.0%       79.9%      100.0%        0.0%       79.9%  1,433         N/A      N/A
Open                     98.1%        5.1%       72.8%       99.5%        1.4%       72.8%  1,091       0.001    0.976

P-hat=.70 Criterion
Selective                99.1%        3.5%       86.4%       99.3%        1.8%       86.3%  3,426       0.889    0.346
Traditional              81.7%       28.8%       70.6%       88.3%       29.3%       75.9%    908       4.082    0.043
Liberal                  94.5%        8.7%       77.3%       90.6%       18.1%       75.9%  1,433       0.714    0.398

P-hat=.85 Criterion
Selective                59.8%       57.2%       59.5%       70.6%       47.4%       67.5%  3,426       0.314    0.575
Traditional              30.5%       85.3%       42.1%       29.7%       83.8%       41.1%    908       0.274    0.601
Liberal                  31.4%       80.2%       41.2%       30.5%       84.0%       41.2%  1,433       0.867    0.352

¹ Denotes results that were identical, regardless of which year's parameters were used. 

Another important issue to consider is selection of 
predictors. Ideally, pre-college predictors should be 
consistently efficient predictors of persistence and be 
readily available prior to arrival on campus. At the same 
time, when persistence programming will be provided for 
at-risk students, at least some of the predictors should be 
manipulable. In other words, the student characteristic(s) 
described by the predictor(s) should be manipulable. 

An example of a suitable predictor was ACT Composite 
score. In this study, ACT Composite was a stable, 
efficient predictor of persistence at each school, and was 
available prior to the arrival of most students to campus. 
Furthermore, ACT Composite scores reflect student 
academic characteristics that can be manipulated through 
assignment to coursework at appropriate levels (e.g., 
remedial or standard courses). 

Along with the practical issues above, there are important 
statistical considerations that require attention as well. When 
selecting variables for inclusion into logistic predictive 
models, it is important to consider how highly potential 
variables correlate with one another within the sample. 
Collinearity can result in inflated variance estimates for 
model parameter estimates, wrong signs and magnitudes 
of these parameters, and other troublesome outcomes. 
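
One simple collinearity check is sketched below for the two-predictor case. It uses hypothetical ACT Composite scores and high school GPAs (not data from the study) and computes the Pearson correlation and the variance inflation factor, which for exactly two predictors reduces to 1 / (1 - r^2). The variance inflation factor is borrowed from linear regression diagnostics, and applying it to the predictors of a logistic model is a common convention rather than a procedure described in this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(r):
    """Variance inflation factor when a model has exactly two predictors correlated at r."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical ACT Composite scores and high school GPAs for eight students.
act = [18, 20, 21, 23, 24, 26, 28, 31]
gpa = [2.6, 3.1, 2.8, 3.0, 3.5, 3.3, 3.8, 3.9]

r = pearson_r(act, gpa)
vif = vif_two_predictors(r)
```

With more than two predictors, each variance inflation factor would instead come from regressing one predictor on all of the others.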

While there is really no set standard for how high the 
intercorrelation needs to be before one worries about the 
impact of collinearity, it is well advised to perform 
collinearity diagnostics in any logistic regression analysis 
and deal with collinear variables that emerge based on 
rules appropriate for each diagnostic procedure. As 
discussed earlier, ACT Composite score and high school 
GPA have a non-trivial correlation with one another, and 
both serve as efficient pre-college predictors of 
persistence. This correlation, or collinearity, would limit 
the stability of regression weights if both were included. 
This situation can result in misleading outcomes when 
trying to decide whether to use one variable or the other. 
If one’s interest is in overall model efficiency instead of 
the functioning of specific variables, the inclusion of 
collinear variables is not a problem. One approach to 
this inclusion would be to create a single predictor that 
represents a combination of the two variables. In this 
way, the predictive efficiency of both variables is included 
in the model. One drawback is that information is lost 
regarding the efficiency of both variables individually. 
Should one wish to use only one of the predictors, a 
simple solution is to run the full model first with one of the 
collinear variables, and then re-run the full model with the 
second collinear variable — but not both. Then, simply 
include the predictor that had the most impact on the 
efficiency of the full model. 
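
The single-predictor approach can be sketched as follows. How the two variables should be combined is not specified here, so the equal-weight average of standardized scores below is only one assumed possibility, and the data are hypothetical.

```python
import statistics


def z_scores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def composite(act_scores, gpas):
    """Equal-weight average of the two standardized predictors (an assumed weighting)."""
    return [(za + zg) / 2.0 for za, zg in zip(z_scores(act_scores), z_scores(gpas))]


# Hypothetical ACT Composite scores and high school GPAs for eight students.
act = [18, 20, 21, 23, 24, 26, 28, 31]
gpa = [2.6, 3.1, 2.8, 3.0, 3.5, 3.3, 3.8, 3.9]

academic_index = composite(act, gpa)
```

The resulting index would enter the logistic model in place of the two original variables, at the cost of no longer estimating a separate parameter for each.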

Considerations for Interpreting Statistical Results 

Though it may be tempting to simply look at the 
probability of persistence as an identifier of students at- 
risk for non-persistence, this practice has limitations 
associated with it. For instance, there is the possibility of 
erroneously assuming similar predicted/actual 
relationships from one parameter year to another when 
using a model to determine appropriate interventions. 

A second issue is that moderate-to-substantial 
differences in average p values between parameter years 
can impact the accuracy of identifying at-risk students. 
Based on a result in Hosmer and Lemeshow (1989), an 
institution’s average p-hat value for a given year is equal to 
the proportion of students persisting in the same year. 
As such, historical patterns can be investigated for 
persistence rate stability by simply reviewing past trends 
in persistence rates. If stability is not observed in past 
years, then other approaches (e.g., non-statistical reviews 
of student records) should be employed. Simply starting 
the process in a given year and using the probability 
approach without first determining the stability of 
persistence rates may give rise to inaccurate or misleading 
results. 
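
The equality between the average estimated probability and the observed persistence rate can be checked numerically. The sketch below fits a one-predictor logistic model with an intercept by Newton-Raphson on a hypothetical cohort (the scores and outcomes are invented for illustration) and compares the two quantities. The equality holds for maximum-likelihood fits that include an intercept and are evaluated on the data used for fitting.

```python
import math


def fit_logistic(xs, ys, iterations=25):
    """Maximum-likelihood fit of logit(p) = b0 + b1 * x by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        # Gradient of the log-likelihood.
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Entries of the 2x2 information matrix.
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix inverse) times (gradient).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical cohort: ACT Composite scores and persistence outcomes (1 = persisted).
act = [15, 17, 18, 19, 20, 21, 22, 23, 24, 26, 28, 30]
persisted = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1]

mean_act = sum(act) / len(act)
centered = [a - mean_act for a in act]  # centering keeps the Newton steps well conditioned
b0, b1 = fit_logistic(centered, persisted)
p_hats = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in centered]

average_p_hat = sum(p_hats) / len(p_hats)
persistence_rate = sum(persisted) / len(persisted)
```

When one year's parameters are applied to another year's cohort, this equality is no longer guaranteed, which is why shifts in persistence rates between years matter for classification.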

This study also demonstrated a further consideration: 
simply looking at selected percentages of correct 
classifications in absence of the “big picture” could be 
misleading. This conclusion stemmed from varied 
percentages in correctly predicted persisters, non- 
persisters, and accuracy rates as shown in Table 6, yet 
few significantly different odds ratios. These results 
suggest that examining only selected classification 
percentages can lead to different conclusions than examining 
the full predicted vs. actual persistence relationship. 

One reality often faced in postsecondary institutions is 
that simply identifying students as being at-risk is not 
helpful. With a limited amount of resources, only a 
portion of an at-risk student cohort may be able to receive 
services. Thus, researchers are often asked to select 
from the at-risk cohort a subgroup for whom persistence- 
related programs/resources would be most beneficial. One 
approach is to first rank order the 
students from “most at risk” to “least at risk” on the basis 
of estimated conditional probabilities of persisting (p). 
Once ranked, researchers can then flag a specific 
percentage of students with the highest risk levels and 
assign them to programs. Note, this approach is over- 
simplified, as more information than just statistics may 
be needed for accurate assignments. 
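
The rank-and-flag approach can be sketched in a few lines. The student identifiers, probabilities, and the 30% share below are hypothetical, and the function name is illustrative.

```python
def flag_highest_risk(probabilities, share):
    """Return the IDs of the given share of students with the lowest estimated probabilities."""
    ranked = sorted(probabilities, key=probabilities.get)  # most at risk first
    n_flagged = round(len(ranked) * share)
    return ranked[:n_flagged]


# Hypothetical student IDs and estimated probabilities of persisting.
p_hat = {"S01": 0.91, "S02": 0.48, "S03": 0.77, "S04": 0.62, "S05": 0.83,
         "S06": 0.55, "S07": 0.69, "S08": 0.88, "S09": 0.73, "S10": 0.95}

flagged = flag_highest_risk(p_hat, share=0.30)
```

Here the three students with the lowest estimated probabilities are flagged, regardless of where those probabilities fall relative to any fixed criterion.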

Considerations for Persistence Research 
Implementation: An Example 

An example of using more than just a probability to 
identify at-risk students is a delayed compensatory at- 
risk model. Let us suppose that the probability modeling 
approach is used to identify at-risk students before they 
come to campus, and that all of the statistical concerns 
identified above have been dealt with as much as possible. 
Without any other information, the probability modeling 
approach may be all an institution can do for identifying 
incoming at-risk students. 

An alternative to immediately assigning students to at- 
risk programs based on statistical analyses would be to 
also implement one-on-one contact with RAs, academic 
advisors, faculty, and general outreach services, similar 
to the “three week checkup” for every new student as 
described in Nichols, et al. (1998). These contacts can 
be used to gather information about important predictors 
of persistence such as intent to leave the institution 
(Bean, 1982), attitudes about the social and academic 
environments (Bean, 1982), and fit between personal 
goals and actual experiences on campus. 

As a conservative approach, any students identified 
as being at-risk by the personal contacts or by a given p 
criterion might be invited to partake in persistence- 
enhancing programming. Alternatively, students 
may be considered at-risk only if both p and the contacts 
recommend at-risk classification. Whether students are 
invited to initiate or continue participation in at-risk 
programming then becomes a decision based upon 
multiple sources of evidence, some of which may override 
the probability model in importance for decision-making. 
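
The two decision rules can be compared with a short sketch. The students, probabilities, and contact results below are hypothetical, and the 0.70 criterion is used only as an example.

```python
def at_risk_either(below_criterion, contact_concern):
    """First rule: flag when the model or the personal contacts indicate risk."""
    return below_criterion or contact_concern


def at_risk_both(below_criterion, contact_concern):
    """Second rule: flag only when the model and the personal contacts agree."""
    return below_criterion and contact_concern


CRITERION = 0.70  # one of the study's criteria; any criterion could be substituted

# Hypothetical students: (estimated probability of persisting, concern raised by a contact).
students = {"S01": (0.64, False), "S02": (0.82, True), "S03": (0.58, True), "S04": (0.90, False)}

either = [sid for sid, (p, concern) in students.items() if at_risk_either(p < CRITERION, concern)]
both = [sid for sid, (p, concern) in students.items() if at_risk_both(p < CRITERION, concern)]
```

The first rule invites three of the four hypothetical students, while the second invites only the one student on whom both sources agree, so the choice between them is in part a choice about how many students available resources can serve.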

Conclusion 

In the end, promoting student persistence is not a 
simple task. The literature recommends that early 
intervention with students at-risk for non-persistence 
constitutes resources well spent (Sadler, et al., 1997; 
Tinto, 1993). Yet, some institutions have student bodies 
with characteristics that challenge attempts to identify at- 
risk students based on pre-college data. An institution 
may have stable predictive relationships from one year to 
the next or it may not, depending on selected predictors, 
classification criteria, base persistence rates, and 
admissions policies. If we are to use persistence 
prediction research to guide programming, we have to 
conduct proper preparatory analyses to ensure that our 
incoming student cohorts do not have a history of extreme 
or rapidly changing persistence rates. Once this 
information is obtained, we need to regularly perform 
cross-validation analyses, or run the risk of assuming 
stable predictive relationships when such an assumption 
is not tenable. 

As discussed above, using a probabilistic approach can 
have problems associated with it. Therefore, this approach 
would best be used in conjunction with other indicators of 
non-persistence risk, such as post-enrollment academic 
variables, personal contact, and the like. This information 
can be obtained through means as informal as local record 
keeping and RA visits, or as formal as using retention 
reporting services provided by institutions such as ACT. 

Along with ongoing research, programs designed to 
promote persistence need to be in place at more points 
in time than just the initial contact with the campus (Sadler, 
et al., 1997). Failing to account for changes in the 
student body, both at admission and as students progress 
through school, increases the risk of using antiquated 
interventions on the wrong group of students. However, 
continued adjustment of programs based upon results 
from research can result in substantial rewards for the 
institution, and ultimately, the students. 

Recommendations for Future Research 

Future research using the present methods may 
consider investigating the applicability of predictive models 
across types of institutions. The typology of 
institutions can be general or specific. For instance, 
institutions can be stratified based on selectivity, as done 
in this study. Or, they may be classified according to 
other characteristics, such as two-year/four-year, public/private, 
etc. Regardless of how typologies are defined, information 
about and strategies for improving student persistence 
are in the best interests of institutions and students. 